arXiv:1503.05395v1 [math.PR] 18 Mar 2015


Modern Stochastics: Theory and Applications 1 (2014) 195-209
DOI: 10.15559/15-VMSTA19 


Testing hypotheses on moments by observations 
from a mixture with varying concentrations 


Alexey Doronin, Rostyslav Maiboroda 

Kyiv National Taras Shevchenko University, Kyiv, Ukraine 

al_doronin@ukr.net (A. Doronin), mre@univ.kiev.ua (R. Maiboroda) 

Received: 22 December 2014, Revised: 21 January 2015, Accepted: 22 January 2015, 

Published online: 4 February 2015 

Abstract A mixture with varying concentrations is a modification of a finite mixture model 
in which the mixing probabilities (concentrations of mixture components) may be different 
for different observations. In the paper, we assume that the concentrations are known and the 
distributions of components are completely unknown. A nonparametric technique is proposed
for testing hypotheses on functional moments of components.

1 Introduction

Finite mixture models (FMMs) arise naturally in statistical analysis of biological and 
sociological data [11, 13]. The model of mixture with varying concentrations (MVC) 
is a modification of the FMM where the mixing probabilities may be different for 
different observations. Namely, we consider a sample of subjects $O_1, \dots, O_N$, where
each subject belongs to one of the subpopulations (mixture components)
$\mathcal{P}_1, \dots, \mathcal{P}_M$. The true subpopulation to which the subject $O_j$
belongs is unknown, but we know the probabilities $p^m_{j;N} = P[O_j \in \mathcal{P}_m]$
(mixing probabilities, concentrations of $\mathcal{P}_m$ in the mixture at the $j$th
observation, $j = 1, \dots, N$, $m = 1, \dots, M$). For each subject $O$, a variable
$\xi(O)$ is observed, which is considered as a random element in a measurable space
$\mathcal{X}$ equipped with a $\sigma$-algebra $\mathfrak{F}$. Let

$$F_m(A) = P[\xi(O) \in A \mid O \in \mathcal{P}_m], \quad A \in \mathfrak{F},$$

be the distribution of $\xi(O)$ for subjects $O$ that belong to the $m$th component. Then
the unconditional distribution of $\xi_{j;N} = \xi(O_j)$ is

$$P[\xi_{j;N} \in A] = \sum_{m=1}^{M} p^m_{j;N} F_m(A), \quad A \in \mathfrak{F}. \qquad (1)$$

The observations $\xi_{j;N}$ are assumed to be independent for $j = 1, \dots, N$.
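
As an illustration of model (1), the following minimal Python sketch draws one observation per subject: a hidden component label is selected with the known concentrations, and the observation is then drawn from that component. The function name `sample_mvc`, the two Gaussian components, and the linear concentration pattern are assumptions made for this example only; they do not come from the paper.

```python
import random


def sample_mvc(concentrations, component_samplers, rng):
    """Draw one observation per subject from the MVC model (1).

    concentrations[j][m] plays the role of p^m_{j;N}; each row sums to 1.
    component_samplers[m](rng) draws from the m-th component distribution F_m.
    Returns (observations, hidden_labels); the labels are unobserved in practice.
    """
    observations, labels = [], []
    for row in concentrations:
        # Pick the hidden component of this subject with probabilities p_{j;N}.
        m = rng.choices(range(len(row)), weights=row, k=1)[0]
        labels.append(m)
        observations.append(component_samplers[m](rng))
    return observations, labels


rng = random.Random(0)
# Hypothetical two-component example: N(0, 1) and N(3, 1).
samplers = [lambda r: r.gauss(0.0, 1.0), lambda r: r.gauss(3.0, 1.0)]
# Concentration of the first component grows linearly from 0 to 1.
concentrations = [[j / 9, 1 - j / 9] for j in range(10)]
xi, hidden = sample_mvc(concentrations, samplers, rng)
```

Only `xi` and `concentrations` would be available to the statistician; `hidden` is returned here purely to make the mechanism visible.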

We consider the nonparametric MVC model where the concentrations $p^m_{j;N}$ are
known but the component distributions $F_m$ are completely unknown. Such models
were applied to analyze gene expression level data [8] and data on sensitive questions
in sociology [12]. An example of sociological data analysis based on MVC is
presented in [9]. In this paper, we consider adherents of different political parties in
Ukraine as subpopulations $\mathcal{P}_i$. Their concentrations are deduced from the 2006
parliament election results in different regions of Ukraine. Individual voters are considered
as subjects; their observed characteristics are taken from the Four-Wave Values Survey
held in Ukraine in 2006. (Note that the political choices of the surveyed individuals
were unknown, so each subject must be considered as selected from a mixture of
different $\mathcal{P}_i$.) For example, one of the observed characteristics is the
satisfaction with personal income (in points from 1 to 10).

A natural question in the analysis of such data is homogeneity testing for different
components. For example, if $\mathcal{X} = \mathbb{R}$, then we may ask if the means or
variances (or both) of the distributions $F_i$ and $F_k$ are the same for some fixed $i$
and $k$ or if the variances of all the components are the same.

In [8], a test is proposed for the hypothesis of two-means homogeneity. In this 
paper, we generalize the approach from [8] to a much richer class of hypotheses, 
including different statements on means, variances, and other generalized functional 
moments of component distributions. 

Hypotheses of equality of MVC component distributions, that is, $F_i = F_k$, were
considered in [6] (a Kolmogorov-Smirnov-type test is proposed) and [1] (tests based
on wavelet density estimation). The technique of our paper also allows testing such
hypotheses using the "grouped $\chi^2$" approach.

Parametric tests for different hypotheses on mixture components were also
considered in [4, 5, 13].

The rest of the paper is organized as follows. We describe the considered hypotheses
formally and discuss the test construction in Section 2. Section 3 contains auxiliary
information on the functional moments estimation in MVC models. In Section 4,
the test is described formally. Section 5 contains results of the test performance
analysis by a simulation study and an example of real-life data analysis. Technical proofs
are given in Appendix A.

2 Problem setting 

In the rest of the paper, we use the following notation. 

The zero vector from $\mathbb{R}^k$ is denoted by $\mathbb{O}_k$. The unit
$k \times k$-matrix is denoted by $\mathbb{I}_{k \times k}$, and the $k \times m$-zero
matrix by $\mathbb{O}_{k \times m}$. Convergences in probability and in distribution are
denoted $\xrightarrow{P}$ and $\xrightarrow{d}$, respectively.

We consider the set of concentrations
$p = (p^m_{j;N},\ j = 1, \dots, N;\ m = 1, \dots, M;\ N = 1, 2, \dots)$ as an infinite
array, $p^{\cdot}_{\cdot;N} = (p^m_{j;N},\ j = 1, \dots, N;\ m = 1, \dots, M)$ as an
$(N \times M)$-matrix, and $p^m_{\cdot;N} = (p^m_{j;N},\ j = 1, \dots, N)^T \in \mathbb{R}^N$
and $p^{\cdot}_{j;N} = (p^m_{j;N},\ m = 1, \dots, M)^T \in \mathbb{R}^M$ as column vectors.
The same notation is used for arrays of similar structure, such as the array $a$
introduced further.

Angle brackets with subscript $N$ denote averaging of an array over all the
observations, for example,

$$\langle p^m \rangle_N = \frac{1}{N} \sum_{j=1}^{N} p^m_{j;N}.$$

Multiplication, summation, and other similar operations inside the angle brackets are
applied to the arrays componentwise, so that

$$\langle a^m p^k \rangle_N = \frac{1}{N} \sum_{j=1}^{N} a^m_{j;N}\, p^k_{j;N},$$

and so on.

Angle brackets without subscript mean the limit of the corresponding averages as
$N \to \infty$ (assuming that this limit exists):

$$\langle a^m p^k \rangle = \lim_{N \to \infty} \langle a^m p^k \rangle_N.$$

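We introduce formally random elements $\eta_m \in \mathcal{X}$ with distributions $F_m$,
$m = 1, \dots, M$.

Consider a set of $K \le M$ measurable functions $g_k : \mathcal{X} \to \mathbb{R}^{d_k}$,
$k = 1, \dots, K$. Let $\bar g^m_k$ be the (vector-valued) functional moment of the $m$th
component with moment function $g_k$, that is,

$$\bar g^m_k := E[g_k(\eta_m)] \in \mathbb{R}^{d_k}. \qquad (2)$$

Fix a measurable function
$T : \mathbb{R}^{d_1} \times \mathbb{R}^{d_2} \times \dots \times \mathbb{R}^{d_K} \to \mathbb{R}^L$.
For data described by the MVC model (1) we consider testing a null hypothesis of the form

$$H_0 : T(\bar g^1_1, \dots, \bar g^K_K) = \mathbb{O}_L \qquad (3)$$

against the general alternative $T(\bar g^1_1, \dots, \bar g^K_K) \ne \mathbb{O}_L$.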

Example 1. Consider a three-component mixture ($M = 3$) with $\mathcal{X} = \mathbb{R}$.
We would like to test the hypothesis $H_0 : Var\,\eta_1 = Var\,\eta_2$ (i.e., the variances
of the first and second components are the same). This hypothesis can be reformulated in
the form (3) by letting $g_1(x) = g_2(x) = (x, x^2)^T$ and
$T((y_{11}, y_{12})^T, (y_{21}, y_{22})^T) = y_{12} - (y_{11})^2 - (y_{22} - (y_{21})^2)$.

Example 2. Let $\mathcal{X} = \mathbb{R}$. Consider the hypothesis of mean homogeneity
$H_0 : E\,\eta_1 = \dots = E\,\eta_M$. Then the choice of $g_m(x) = x$,
$T(y_1, \dots, y_M) = (y_1 - y_2, y_2 - y_3, \dots, y_{M-1} - y_M)^T$ reduces $H_0$ to
the form (3).

Example 3. Let $\mathcal{X}$ be a finite discrete space: $\mathcal{X} = \{x_1, \dots, x_r\}$.
Consider the distribution homogeneity hypothesis $H_0 : F_1 = F_2$. To present it in the
form (3), we can use $g_i(x) = (1\{x = x_k\},\ k = 1, \dots, r - 1)^T$ and
$T(y_1, y_2) = y_1 - y_2$ ($y_i \in \mathbb{R}^{r-1}$ for $i = 1, 2$). In the case of
continuous distributions, $H_0$ can be discretized by data grouping.

To test $H_0$ defined by (3), we adopt the following approach. Let there be some
consistent estimators $\hat g^k_{k;N}$ for $\bar g^k_k$. Assume that $T$ is continuous.
Consider the statistic $\hat T_N = T(\hat g^1_{1;N}, \dots, \hat g^K_{K;N})$. Then, under
$H_0$, $\hat T_N \approx \mathbb{O}_L$, and a far departure of $\hat T_N$ from zero will
be evidence in favor of the alternative.

To measure this departure, we use a Mahalanobis-type distance. If $\sqrt{N}\,\hat T_N$ is
asymptotically normal with a nonsingular asymptotic covariance matrix $D$, then, under
$H_0$, $N \hat T_N^T D^{-1} \hat T_N$ is asymptotically $\chi^2_L$-distributed. In fact,
$D$ depends on unknown component distributions $F_i$, so we replace it by its consistent
estimator $\hat D_N$. The resulting statistic
$\hat s_N = N \hat T_N^T \hat D_N^{-1} \hat T_N$ is a test statistic. The test rejects
$H_0$ if $\hat s_N > Q^{\chi^2_L}(1 - \alpha)$, where $\alpha$ is the significance level,
and $Q^G(\alpha)$ denotes the quantile of level $\alpha$ for distribution $G$.

Possible candidates for the role of estimators $\hat g^k_{k;N}$ and $\hat D_N$ are
considered in the next section.
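
The Mahalanobis-type statistic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `solve_linear` and `mahalanobis_statistic` and the numerical inputs are hypothetical, and the estimates of $T$ and $D$ are taken as given rather than computed from data.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance estimate is numerically singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis_statistic(t_hat, d_hat, n):
    """Return N * T^T D^{-1} T for an estimate t_hat of T and d_hat of D."""
    z = solve_linear(d_hat, t_hat)  # z = D^{-1} T without forming the inverse
    return n * sum(t * zi for t, zi in zip(t_hat, z))


# Hypothetical inputs: a two-dimensional T estimate and two covariance estimates.
s_diag = mahalanobis_statistic([0.1, -0.2], [[2.0, 0.0], [0.0, 4.0]], n=100)
s_corr = mahalanobis_statistic([1.0, 1.0], [[2.0, 1.0], [1.0, 2.0]], n=3)
```

Solving the linear system instead of inverting the matrix is a design choice of this sketch; it also surfaces a singular covariance estimate as an explicit error.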

3 Estimation of functional moments 

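Let us start with the nonparametric estimation of $F_m$ by the weighted empirical
distribution of the form

$$\hat F_{m;N}(A) = \frac{1}{N} \sum_{j=1}^{N} a^m_{j;N}\, 1\{\xi_{j;N} \in A\},$$

where $a^m_{j;N}$ are some nonrandom weights to be selected "in the best way." Denote
$e_m = (1\{k = m\},\ k = 1, \dots, M)^T$ and

$$\Gamma_N = (\langle p^m p^i \rangle_N)_{m,i=1}^{M}.$$

Assume that $\Gamma_N$ is nonsingular. It is shown in [8] that, in this case, the weight
array

$$a^m_{\cdot;N} = p^{\cdot}_{\cdot;N}\, \Gamma_N^{-1} e_m$$

yields the unbiased estimator with minimal assured quadratic risk.

The simple estimator $\hat g^m_{i;N}$ for $\bar g^m_i$ is defined as

$$\hat g^m_{i;N} = \int_{\mathcal{X}} g_i(x)\, \hat F_{m;N}(dx)
  = \frac{1}{N} \sum_{j=1}^{N} a^m_{j;N}\, g_i(\xi_{j;N}).$$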
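
The weight formula and the simple estimator can be sketched as follows. This is a minimal pure-Python illustration, assuming a small hypothetical concentration matrix; the helper names `invert`, `minimax_weights`, and `simple_moment` are not from the paper. The sketch relies on the identity $\langle a^m p^k \rangle_N = 1\{m = k\}$, which follows from the weight formula and is what makes the simple estimator unbiased.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("Gamma_N is numerically singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def minimax_weights(p):
    """Rows of p are the concentration vectors; returns a[j][m] = (Gamma_N^{-1} p_j)_m."""
    n, m = len(p), len(p[0])
    gamma = [[sum(r[k] * r[l] for r in p) / n for l in range(m)] for k in range(m)]
    gamma_inv = invert(gamma)
    return [[sum(gamma_inv[k][l] * r[l] for l in range(m)) for k in range(m)]
            for r in p]


def simple_moment(weights_m, values):
    """Weighted average (1/N) * sum_j a^m_j * g(xi_j) for one component m."""
    return sum(a * v for a, v in zip(weights_m, values)) / len(values)


# Hypothetical concentrations for N = 10 subjects and M = 2 components.
p = [[j / 9, 1 - j / 9] for j in range(10)]
a = minimax_weights(p)
first_component_weights = [row[0] for row in a]
```

Note that some of the resulting weights are negative, which is the source of the finite-sample difficulties discussed later in this section.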

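We denote $\Gamma = \lim_{N \to \infty} \Gamma_N = (\langle p^m p^i \rangle)_{m,i=1}^{M}$.
Let $h : \mathcal{X} \to \mathbb{R}^d$ be any measurable function, let
$\bar h^m = E[h(\eta_m)]$, and let $\hat h^m_{;N}$ be the corresponding simple estimator.

Theorem 1. ([9], Lemma 1) Assume that:

(i) $\Gamma$ exists, and $\det \Gamma \ne 0$;

(ii) $E[|h(\eta_m)|] < \infty$, $m = 1, \dots, M$.

Then $\hat h^m_{;N} \xrightarrow{P} \bar h^m = E[h(\eta_m)]$ as $N \to \infty$ for all
$m = 1, \dots, M$.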
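
To formulate the asymptotic normality result for the simple moment estimators,
we need some additional notation.

We consider the set of all moments $\bar g^k_k$, $k = 1, \dots, K$, as one long vector
belonging to $\mathbb{R}^d$, $d := d_1 + \dots + d_K$:

$$\bar g := ((\bar g^1_1)^T, \dots, (\bar g^K_K)^T)^T \in \mathbb{R}^d. \qquad (4a)$$

The corresponding estimators also form a long vector

$$\hat g_N := ((\hat g^1_{1;N})^T, \dots, (\hat g^K_{K;N})^T)^T \in \mathbb{R}^d. \qquad (4b)$$

We denote the matrices of mixed second moments of $g_k(x)$, $k = 1, \dots, K$, and the
corresponding estimators as

$$\bar g^m_{k,l} := E[g_k(\eta_m)\, g_l(\eta_m)^T] \in \mathbb{R}^{d_k \times d_l},
  \quad k, l = 1, \dots, K,\ m = 1, \dots, M; \qquad (5a)$$

$$\hat g^m_{k,l;N} := \frac{1}{N} \sum_{j=1}^{N} a^m_{j;N}\, g_k(\xi_{j;N})\, g_l(\xi_{j;N})^T,
  \quad k, l = 1, \dots, K. \qquad (5b)$$

We consider the function $T$ as a function of $d$-dimensional argument, that is,
$T(y) := T(y^1, \dots, y^K)$. Then
$\hat T_N := T(\hat g_N) = T(\hat g^1_{1;N}, \dots, \hat g^K_{K;N})$.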

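Let us define the following matrices (assuming that the limits exist):

$$\alpha_{r,s;N} := (\alpha^{k,l}_{r,s;N})_{k,l=1}^{K}
  := (\langle a^k a^l p^r p^s \rangle_N)_{k,l=1}^{K}, \qquad (6a)$$

$$\alpha_{r,s} := (\alpha^{k,l}_{r,s})_{k,l=1}^{K}
  := \Big(\lim_{N \to \infty} \alpha^{k,l}_{r,s;N}\Big)_{k,l=1}^{K},
  \quad r, s = 1, \dots, M, \qquad (6b)$$

$$\beta_{m;N} := (\beta^{k,l}_{m;N})_{k,l=1}^{K}
  := (\langle a^k a^l p^m \rangle_N)_{k,l=1}^{K}, \qquad (7a)$$

$$\beta_m := (\beta^{k,l}_m)_{k,l=1}^{K}
  := \Big(\lim_{N \to \infty} \beta^{k,l}_{m;N}\Big)_{k,l=1}^{K},
  \quad m = 1, \dots, M. \qquad (7b)$$

Then the asymptotic covariance matrix of the normalized estimate
$\sqrt{N}(\hat g_N - \bar g)$ is $\Sigma$, where $\Sigma$ consists of the blocks

$$\Sigma^{(k,l)} := \sum_{m=1}^{M} \beta^{k,l}_m\, \bar g^m_{k,l}
  - \sum_{r,s=1}^{M} \alpha^{k,l}_{r,s}\, \bar g^r_k (\bar g^s_l)^T
  \in \mathbb{R}^{d_k \times d_l}. \qquad (8a)$$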

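Theorem 2. Assume that:

(i) The functional moments $\bar g^m_k$, $\bar g^m_{k,l}$ exist and are finite for
$k, l = 1, \dots, K$, $m = 1, \dots, M$.

(ii) There exists $\delta > 0$ such that $E[|g_k(\eta_m)|^{2+\delta}] < \infty$,
$k = 1, \dots, K$, $m = 1, \dots, M$.

(iii) There exist finite matrices $\Gamma$, $\Gamma^{-1}$, $\alpha_{r,s}$, and $\beta_m$
for $r, s, m = 1, \dots, M$.

Then $\sqrt{N}(\hat g_N - \bar g) \xrightarrow{d} \zeta \simeq N(\mathbb{O}_d, \Sigma)$,
$N \to \infty$.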

Thus, to construct a test for $H_0$, we need a consistent estimator for $\Sigma$. The
matrices $\alpha_{r,s;N}$ and $\beta_{m;N}$ are natural estimators for $\alpha_{r,s}$ and
$\beta_m$. It is also natural to estimate $\bar g^m_{k,l}$ by $\hat g^m_{k,l;N}$ defined
in (5b). In view of Theorem 1, these estimators are consistent under the assumptions of
Theorem 2. But they can possess undesirable properties for moderate sample size. Indeed,
note that $\hat F_{m;N}$ is not a probability distribution itself since the weights
$a^m_{j;N}$ are negative for some $j$. Therefore, for example, the simple estimator of the
second moment of some component can be negative, the estimator (5b) for the positive
semidefinite matrix $\bar g^m_{k,k}$ can fail to be positive semidefinite, and so on. Due
to the asymptotic normality result, this is not too troublesome for estimation of
$\bar g$. But it causes serious difficulties when one uses an estimator of the asymptotic
covariance matrix $D$ based on $\hat g^m_{k,l;N}$ in order to calculate $\hat s_N$.

In [10], a technique for improving $\hat F_{m;N}$ and $\hat h^m_{;N}$ is developed that
allows one to derive estimators with more adequate finite-sample properties if
$\mathcal{X} = \mathbb{R}$.

So, assume that $\xi(O) \in \mathbb{R}$ and consider the weighted empirical CDF

$$\hat F_{m;N}(x) = \frac{1}{N} \sum_{j=1}^{N} a^m_{j;N}\, 1\{\xi_{j;N} < x\}.$$

It is not a nondecreasing function, and it can attain values outside $[0, 1]$ since some
$a^m_{j;N}$ are negative. The transform

$$\tilde F^+_{m;N}(x) = \sup_{y \le x} \hat F_{m;N}(y)$$

yields a monotone function $\tilde F^+_{m;N}(x)$, but it still can be greater than 1 at
some $x$. So, define

$$F^+_{m;N}(x) = \min\{1, \tilde F^+_{m;N}(x)\}$$

as the improved estimator for $F_m(x)$. Note that this is an "improvement upward,"
since $\tilde F^+_{m;N}(x) \ge \hat F_{m;N}(x)$. Similarly, a downward improved estimator
can be defined as

$$\tilde F^-_{m;N}(x) = \inf_{y \ge x} \hat F_{m;N}(y),$$

$$F^-_{m;N}(x) = \max\{0, \tilde F^-_{m;N}(x)\}.$$
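
The upward and downward improvements can be sketched on the step levels of the weighted empirical CDF. The following Python fragment is an illustration only: the function name `improved_cdf_levels` and the toy weights are assumptions, the data points are assumed to be distinct, and the function works with the levels taken between successive order statistics rather than with a particular one-sided continuity convention.

```python
from itertools import accumulate


def improved_cdf_levels(weights, values):
    """Step levels of the weighted empirical CDF and its two monotone versions.

    Returns (raw, upper, lower). Each list has N + 1 entries: entry i is the
    level taken after the i-th smallest observation (entry 0 lies to the left
    of all data). Assumes the observations are distinct.
    """
    n = len(values)
    order = sorted(range(n), key=lambda j: values[j])
    # Cumulative weighted sums give the raw (possibly non-monotone) levels.
    raw = [0.0] + list(accumulate(weights[j] / n for j in order))
    # Upward improvement: running supremum from the left, capped at 1.
    upper = [min(1.0, v) for v in accumulate(raw, max)]
    # Downward improvement: running infimum from the right, floored at 0.
    lower_reversed = list(accumulate(reversed(raw), min))
    lower = [max(0.0, v) for v in reversed(lower_reversed)]
    return raw, upper, lower


# Toy weights (hypothetical); they average to 1, as the weights a^m_{j;N} do.
weights = [2.0, -1.0, 3.0, 0.0]
values = [0.1, 0.2, 0.3, 0.4]
raw, upper, lower = improved_cdf_levels(weights, values)
```

In this toy case the negative second weight makes the raw levels dip from 0.5 to 0.25; the upward version holds the level at 0.5, and the downward version lowers the earlier level to 0.25.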

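Any CDF that lies between $F^-_{m;N}(x)$ and $F^+_{m;N}(x)$ can be considered as an
improved version of $\hat F_{m;N}(x)$. We will use only one such improvement, which
combines $F^-_{m;N}(x)$ and $F^+_{m;N}(x)$:

$$F^{\pm}_{m;N}(x) =
\begin{cases}
F^+_{m;N}(x) & \text{if } F^+_{m;N}(x) \le 1/2, \\
F^-_{m;N}(x) & \text{if } F^-_{m;N}(x) \ge 1/2, \\
1/2 & \text{otherwise.}
\end{cases} \qquad (9)$$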


Note that all three considered estimators $F^*_{m;N}$ ($*$ means any symbol from $+$,
$-$, or $\pm$) are piecewise constant on intervals between successive order statistics of
the data. Thus, they can be represented as

$$F^*_{m;N}(x) = \frac{1}{N} \sum_{j=1}^{N} b^{m*}_{j;N}\, 1\{\xi_{j;N} < x\},$$

where $b^{m*}_{j;N}$ are some random weights that depend on the data.

The corresponding improved estimator for $\bar g^m_i$ is

$$\hat g^{m*}_{i;N} = \int_{-\infty}^{+\infty} g_i(x)\, F^*_{m;N}(dx)
  = \frac{1}{N} \sum_{j=1}^{N} b^{m*}_{j;N}\, g_i(\xi_{j;N}).$$

J-oo 

Let $h : \mathbb{R} \to \mathbb{R}$ be a measurable function.

Theorem 3. Assume that $\Gamma$ exists and $\det \Gamma \ne 0$.

(I) If for some $c_- < c_+$, $c_- < \eta_m < c_+$ for all $m = 1, \dots, M$ and $h$ has
bounded variation on $(c_-, c_+)$, then $\hat h^{m*}_{;N} \to \bar h^m$ a.s. as
$N \to \infty$ for all $m = 1, \dots, M$ and $* \in \{+, -, \pm\}$.

(II) Assume that:

(i) For some $\gamma > 0$, $E[|h(\eta_m)|^{2+\gamma}] < \infty$.

(ii) $h$ is a continuous function of bounded variation on some interval $[c_-, c_+]$
and monotone on $(-\infty, c_-]$ and $[c_+, +\infty)$.

Then $\hat h^{m\pm}_{;N} \to \bar h^m$ in probability.


4 Construction of the test 


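We first state an asymptotic normality result for $\hat T_N$. Denote by

$$T'(y) := \Big(\frac{\partial T_i(y)}{\partial y_j}\Big)_{i = 1, \dots, L;\ j = 1, \dots, d}$$

the matrix of partial derivatives of $T$.

Theorem 4. Assume that:

(i) $T'(\bar g)$ exists.

(ii) The assumptions of Theorem 2 hold.

(iii) The matrix $D = T'(\bar g)\, \Sigma\, (T'(\bar g))^T$ is nonsingular.

Then, under $H_0$, $\sqrt{N}\,\hat T_N \xrightarrow{d} N(\mathbb{O}_L, D)$.

For the proof, see the Appendix. Note that (iii) implies the nonsingularity of $\Sigma$.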
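Now, to estimate $D$, we can use

$$\hat D_N = T'(\tilde g_N)\, \hat\Sigma_N\, (T'(\tilde g_N))^T,$$

where $\tilde g_N$ is any consistent estimator for $\bar g$,

$$\hat\Sigma^{(k,l)}_N := \sum_{m=1}^{M} \beta^{k,l}_{m;N}\, \tilde g^m_{k,l;N}
  - \sum_{r,s=1}^{M} \alpha^{k,l}_{r,s;N}\, \tilde g^r_{k;N} (\tilde g^s_{l;N})^T
  \in \mathbb{R}^{d_k \times d_l}, \qquad (10a)$$

$$\hat\Sigma_N := (\hat\Sigma^{(k,l)}_N)_{k,l=1}^{K} \in \mathbb{R}^{d \times d}, \qquad (10b)$$

where $\tilde g^m_{k,l;N}$ is any consistent estimator for $\bar g^m_{k,l}$. For example,
we can use

$$\tilde g^m_{k,l;N} = \hat g^{m*}_{k,l;N}
  := \frac{1}{N} \sum_{j=1}^{N} b^{m*}_{j;N}\, g_k(\xi_{j;N})\, g_l(\xi_{j;N})^T$$

if $\mathcal{X} = \mathbb{R}$ and the assumptions of Theorem 3 hold for all
$h(x) = g^i_l(x)\, g^n_k(x)$, $l, k = 1, \dots, K$, $i = 1, \dots, d_l$,
$n = 1, \dots, d_k$, where $g_l(x) = (g^1_l(x), \dots, g^{d_l}_l(x))^T$.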
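
The plug-in formula for the covariance of the test statistic is an instance of the delta method. As a hedged illustration, the sketch below applies it to a variance-equality map of the kind used in Example 1, with moment vector $y = (y_{11}, y_{12}, y_{21}, y_{22})$ and $L = 1$. The function names, the moment values, and the identity matrix standing in for $\hat\Sigma_N$ are all hypothetical.

```python
def variance_difference(y):
    """T(y) = (y12 - y11^2) - (y22 - y21^2): difference of two variances."""
    y11, y12, y21, y22 = y
    return (y12 - y11 ** 2) - (y22 - y21 ** 2)


def variance_difference_jacobian(y):
    """Row vector T'(y) of partial derivatives of T with respect to y."""
    y11, _, y21, _ = y
    return [-2.0 * y11, 1.0, 2.0 * y21, -1.0]


def delta_method_variance(jacobian_row, sigma):
    """Quadratic form J Sigma J^T for a single-row Jacobian (L = 1)."""
    d = len(jacobian_row)
    return sum(jacobian_row[i] * sigma[i][j] * jacobian_row[j]
               for i in range(d) for j in range(d))


# Hypothetical estimates of (E eta_1, E eta_1^2, E eta_2, E eta_2^2).
g_tilde = [1.0, 3.0, 0.0, 2.0]
# Identity matrix used as a stand-in for the estimate of Sigma.
sigma_hat = [[float(i == j) for j in range(4)] for i in range(4)]

t_hat = variance_difference(g_tilde)
d_hat = delta_method_variance(variance_difference_jacobian(g_tilde), sigma_hat)
```

For a vector-valued $T$ the same quadratic form would be computed for every pair of Jacobian rows, giving an $L \times L$ matrix.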

Now let the test statistic be $\hat s_N = N \hat T_N^T \hat D_N^{-1} \hat T_N$. For a
given significance level $\alpha$, the test $\pi_{N,\alpha}$ accepts $H_0$ if
$\hat s_N \le Q^{\chi^2_L}(1 - \alpha)$ and rejects $H_0$ otherwise.

The p-level of the test (i.e., the attained significance level) can be calculated as
$p = 1 - G(\hat s_N)$, where $G$ means the CDF of the $\chi^2_L$-distribution.
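
The p-level computation can be sketched without external libraries. The function below is an assumption of this example (it is not the authors' code): it evaluates the $\chi^2$ survival function $1 - G(s)$ for integer degrees of freedom using the closed forms for 1 and 2 degrees of freedom and the standard recursion that adds two degrees of freedom at a time.

```python
import math


def chi2_sf(x, df):
    """Survival function 1 - G(x) of the chi-square law with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    k = 2 - df % 2  # start from 1 degree of freedom if df is odd, else from 2
    q = math.erfc(math.sqrt(half)) if k == 1 else math.exp(-half)
    while k < df:
        # Q(k + 2, x) = Q(k, x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Statistic reported for the (ss) test of equal means in Table 2 (df = 2).
p_level = chi2_sf(11.776, 2)
```

Applied to the statistic 11.776 with 2 degrees of freedom, this gives a p-level of about 0.00277, in line with the value tabulated in Section 5.2.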

Theorem 5. Let the assumptions of Theorem 4 hold. Moreover, assume the following:

(i) $\tilde g_N$ and $\tilde g^m_{k,l;N}$ ($k, l = 1, \dots, K$, $m = 1, \dots, M$) are
consistent estimators for $\bar g$ and $\bar g^m_{k,l}$, respectively.

(ii) $T'$ is continuous in some neighborhood of $\bar g$.

Then $\lim_{N \to \infty} P_{H_0}\{\pi_{N,\alpha} \text{ rejects } H_0\} = \alpha$.

Example 2 (Continued). Consider testing $H_0$ by the test $\pi_{N,\alpha}$ with
$g_i(x) = x$ and $T(y_1, \dots, y_M) = (y_1 - y_2, y_2 - y_3, \dots, y_{M-1} - y_M)^T$.
It is obvious that $T'(y)$ is a constant matrix of full rank. Assume that
$Var[\eta_m] > 0$ for all $m = 1, \dots, M$ and $\det \Gamma \ne 0$. Then $\Sigma$ is
nonsingular, and so is $D$. Thus, in this case, assumptions (i) and (iii) of Theorem 2,
(i) and (iii) of Theorem 4, and (ii) of Theorem 5 hold.

To ensure assumption (ii) of Theorem 2, we need $E[|\eta_m|^{2+\delta}] < \infty$ for
some $\delta > 0$ and all $m = 1, \dots, M$. In view of Theorem 1, this assumption also
implies the consistency of $\hat g_N$ and $\hat g^m_{k,l;N}$. If one uses $\hat g^*_N$
and $\hat g^{m*}_{k,l;N}$ as estimators $\tilde g_N$ and $\tilde g^m_{k,l;N}$ in
$\hat D_N$, then a more restrictive assumption $E[|\eta_m|^{4+\delta}] < \infty$ is needed
to ensure their consistency by Theorem 3.

5 Numerical results 

5.1 Simulation study 

To assess the performance of the proposed test on samples of moderate size, we conducted
a small simulation study. Three-component mixtures were analyzed ($M = 3$) with
Gaussian components $F_m \sim N(\mu_m, \sigma^2_m)$. The concentrations were generated as
$p^m_{j;N} = \zeta^m_{j;N} / s_{j;N}$, where $\zeta^m_{j;N}$ are independent random
variables uniformly distributed on $[0, 1]$, and
$s_{j;N} = \sum_{m=1}^{M} \zeta^m_{j;N}$. In the experiments, 1000 samples were
generated for each sample size $N$ = 50, 100, 250, 500, 750, 1000, 2000, and 5000.
Three modifications of the test $\pi_{N,\alpha}$ were applied to each sample. In the
first modification, (ss), simple estimators were used to calculate both $\hat T_N$ and
$\hat D_N$. In the second modification, (si), simple estimators were used in $\hat T_N$,
and the improved ones were used in $\hat D_N$. In the last modification, (ii), improved
estimators were used in both $\hat T_N$ and $\hat D_N$. Note that the modification (ii)
has no theoretical justification since, as far as we know, there are no results on the
limit distribution of $\sqrt{N}(\hat g^*_N - \bar g)$.

All tests were used with the nominal significance level $\alpha = 0.05$.
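
The way the concentrations were generated can be sketched as follows: each row of independent uniform variables is normalized by its sum. The function name `simulate_concentrations` and the seed are assumptions of this illustration.

```python
import random


def simulate_concentrations(n, m, rng):
    """Generate an n-by-m array of concentrations by normalizing uniform draws."""
    rows = []
    for _ in range(n):
        zeta = [rng.random() for _ in range(m)]  # independent U[0, 1] variables
        s = sum(zeta)                            # normalizing sum for this subject
        rows.append([z / s for z in zeta])
    return rows


# One concentration array for N = 50 subjects and M = 3 components.
p = simulate_concentrations(n=50, m=3, rng=random.Random(1))
```

Each row of `p` is a probability vector, so it can be used directly as the mixing probabilities of one subject in model (1).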

Fig. 1. Testing homogeneity of means ($H_0^{\mu}$)


In the figures, frequencies of errors of the tests are presented. In the plots, □
corresponds to the (ss), △ to the (si), and ○ to the (ii) modification.

Experiment A1. In this experiment, we consider testing the mean homogeneity
hypothesis $H_0^{\mu}$. The means were taken $\mu_m = 0$, $m = 1, 2, 3$, so $H_0^{\mu}$
holds. To shadow the equality of means, different variances of components were taken,
namely $\sigma^2_1 = 1$, $\sigma^2_2 = 4$, and $\sigma^2_3 = 9$. The resulting first-type
error frequencies are presented on the left panel of Fig. 1. For the (ss) test, for small
$N$, there were 1.4% cases of incorrect covariance matrix estimates ($\hat D_N$ was not
positive definite). Incorrect estimates were absent for large $N$.

Experiment A2. Here we also tested $H_0^{\mu}$ for components with the same variances as
in A1. But $\mu_1 = 2$ and $\mu_2 = \mu_3 = 0$, so $H_0^{\mu}$ does not hold. The
frequencies of the second-type error are presented on the right panel of Fig. 1. The
percent of incorrect estimates $\hat D_N$ is 1.6% for (ss) and small $N$.

Experiment B1. In this and the next experiment, we tested
$H_0^{\sigma} : \sigma^2_1 = \sigma^2_2$. The data were generated with $\mu_1 = 0$,
$\mu_2 = 3$, $\mu_3 = -2$, $\sigma^2_1 = \sigma^2_2 = 1$, and $\sigma^2_3 = 4$, so
$H_0^{\sigma}$ holds. The frequencies of the first-type error are presented on the left
panel of Fig. 2. The percent of incorrect $\hat D_N$ in (ss) varies from 19.4% for small
$N$ to 0% for large $N$.

Experiment B2. Now $\mu_m$ and $\sigma^2_3$ are the same as in B1, but $\sigma^2_1 = 1$
and $\sigma^2_2 = 4$, so $H_0^{\sigma}$ does not hold. The frequencies of the second-type
error are presented on the right panel of Fig. 2. The percent of incorrect $\hat D_N$ in
(ss) was 15.5% for small $N$ and decreased to 0% for large $N$.

The presented results show reasonable agreement of the observed significance
levels of the tests with their nominal level 0.05 when the sample sizes were larger than
500. The power of the tests increases to 1 as the sample sizes grow. It is interesting to
note that the (ii) modification, although theoretically not justified, demonstrates the
least first-type error and rather good power. From these results, the (si) modification
of the test seems the most prudent one.

Fig. 2. Testing equality of variances ($H_0^{\sigma}$)

5.2 Example of a sociological data analysis

Consider the data discussed in [9]. They consist of two parts. The first part is the data
from the Four-Wave World Values Survey (FWWVS) held in Ukraine by the European
Values Study Foundation (www.europeanvalues.nl) and World Values Survey
Association (www.worldvaluessurvey.org) in 2006. They contain answers of $N = 4006$
Ukrainian respondents to different questions about their social status and attitudes
to different human values. We consider here the level of satisfaction with personal
income (subjective income) as our variable of interest $\xi$, so $\xi_{j;N}$ is the
subjective income of the $j$th respondent.

Table 1. Means ($\mu$) and variances ($\sigma^2$) for the subjective income distribution
on different political populations

                      PR         OC         Other
$\hat\mu$             2.31733    2.65091    4.44504
$\hat\mu^{+}$         2.45799    2.64187    4.44504
$\hat\sigma^2$        0.772514   4.85172    4.93788
$\hat\sigma^{2+}$     2.09235    4.7639     4.93788

Our aim is to analyze differences in the distribution of $\xi$ on populations of
adherents of different political parties. Namely, we use the data on results of Ukrainian
Parliament elections held in 2006. 46 parties took part in the elections. The voters
could also vote against all or not take part in the voting. We divided all the population
of Ukrainian voters into three large groups (political subpopulations): $\mathcal{P}_1$,
which contains adherents of the Party of Regions (PR, 32.14% of votes), $\mathcal{P}_2$
of Orange Coalition supporters (OC, which consisted of "BJUT" and "NU" parties, 36.24%),
and $\mathcal{P}_3$ of all others, including the persons who voted against all or did not
take part in the poll (Other).

Political preferences of respondents are not available in the FWWVS data, so we
used official results of the elections by 27 regions of Ukraine (see the site of the
Ukrainian Central Elections Commission www.cvk.gov.ua) to estimate the concentrations
$p^m_{j;N}$ of the considered political subpopulations in the region where the $j$th
respondent voted.

Means and variances of $\xi$ over different subpopulations were estimated from the data
(see Table 1). Different tests were performed to test their differences. The results are
presented in Table 2. Here $\mu_m$ means the expectation, and $\sigma^2_m$ means the
variance of $\xi$ over the $m$th subpopulation. Degrees of freedom for the limit
$\chi^2$ distribution are placed in the "df" column.

Table 2. Test statistics and p-values for hypotheses on subjective income distribution

Hypotheses                              ss            si            ii           df
$\mu_1 = \mu_2 = \mu_3$                 11.776        10.8658       8.83978      2
  p-value                               0.00277252    0.0043704     0.0120356
$\mu_1 = \mu_2$                         2.15176       2.04539       0.621483     1
  p-value                               0.142407      0.152668      0.430497
$\mu_1 = \mu_3$                         10.7076       10.0351       8.75216      1
  p-value                               0.00106696    0.00153585    0.00309236
$\mu_2 = \mu_3$                         7.40835       7.10653       7.17837      1
  p-value                               0.00649218    0.00768036    0.00737877
$\sigma^2_1 = \sigma^2_2 = \sigma^2_3$  15.8317       14.786        6.40963      2
  p-value                               0.000364914   0.000615547   0.0405664
$\sigma^2_1 = \sigma^2_2$               14.7209       13.8844       5.95528      1
  p-value                               0.000124657   0.000194405   0.0146733
$\sigma^2_1 = \sigma^2_3$               1.92166       1.77162       0.826778     1
  p-value                               0.165674      0.183182      0.363206
$\sigma^2_2 = \sigma^2_3$               0.000741088   0.00072198    0.00294353   1
  p-value                               0.978282      0.978564      0.956733

These results show that the hypothesis of homogeneity of all variances must be
definitely rejected. The variances of $\xi$ for PR and OC adherents are different, but the
tests failed to observe significant differences in the pairs of variances PR-Other and
OC-Other. For the means, all the tests agree that PR and OC have the same mean
whereas the mean of Other is different from the common mean of PR and OC.


6 Concluding remarks 

We developed a technique that allows one to construct testing procedures for different
hypotheses on functional moments of mixtures with varying concentrations. This
technique can be applied to test the homogeneity of means or variances (or both) of 
some components of the mixture. Performance of different modifications of the test 
procedure is compared in a small simulation study. The (ss) modification showed the 
worst first-type error and the highest power. The (ii) test has the best first-type error 
and the worst power. It seems that the (si) modification can be recommended as a 
golden mean. 

Acknowledgement 

The authors are thankful to the anonymous referee for fruitful comments. 

A Appendix 

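Proof of Theorem 2. Note that
$S_N = \sqrt{N}(\hat g_N - \bar g) = \sum_{j=1}^{N} \zeta_{j;N}$, where

$$\zeta_{j;N} = \frac{1}{\sqrt{N}} \Big( a^k_{j;N} \big( g_k(\xi_{j;N}) - E[g_k(\xi_{j;N})] \big)^T,\
  k = 1, \dots, K \Big)^T.$$

We will apply the CLT with the Lyapunov condition (see Theorem 11 from Chapter 8
and Remark 4 in Section 4.8 in [2]) to $S_N$. It is readily seen that $\zeta_{j;N}$,
$j = 1, \dots, N$, are independent for fixed $N$ and $E[\zeta_{j;N}] = 0$.

Let $\Sigma_{j;N} = Cov(\zeta_{j;N})$. Then $\Sigma_{j;N}$ consists of the blocks

$$\Sigma^{(k,l)}_{j;N} = \frac{1}{N}\, a^k_{j;N} a^l_{j;N}
  \Big( \sum_{m=1}^{M} p^m_{j;N}\, \bar g^m_{k,l}
  - \Big( \sum_{m=1}^{M} p^m_{j;N}\, \bar g^m_k \Big)
    \Big( \sum_{m=1}^{M} p^m_{j;N}\, \bar g^m_l \Big)^T \Big).$$

It is readily seen that $\sum_{j=1}^{N} \Sigma_{j;N} \to \Sigma$ as $N \to \infty$. So

$$Cov\, S_N \to \Sigma \quad \text{as } N \to \infty. \qquad (11)$$

To apply the CLT, we only need to verify the Lyapunov condition

$$\sum_{j=1}^{N} E[|\zeta_{j;N}|^{2+\delta}] \to 0 \quad \text{for some } \delta > 0. \qquad (12)$$

Note that assumption (iii) implies

$$\sup_{1 \le j \le N,\ 1 \le m \le M,\ N > N_0} |a^m_{j;N}| < C_1$$

for some $N_0$ and $C_1 < \infty$. Hence, for $N > N_0$,

$$\sum_{j=1}^{N} E[|\zeta_{j;N}|^{2+\delta}]
  \le \frac{(2 C_1)^{2+\delta}}{N^{1+\delta/2}} \sum_{j=1}^{N} E[|g(\xi_{j;N})|^{2+\delta}],
  \qquad (13)$$

where $g(x) = (g_1(x)^T, \dots, g_K(x)^T)^T$. Since
$|g(\xi_{j;N})|^2 = \sum_{k=1}^{K} |g_k(\xi_{j;N})|^2$ and, by the Hölder inequality,

$$\Big( \sum_{k=1}^{K} |g_k(\xi_{j;N})|^2 \Big)^{1+\delta/2}
  \le K^{\delta/2} \sum_{k=1}^{K} |g_k(\xi_{j;N})|^{2+\delta},$$

we obtain

$$E[|g(\xi_{j;N})|^{2+\delta}]
  \le K^{\delta/2} \sum_{k=1}^{K} E[|g_k(\xi_{j;N})|^{2+\delta}]
  = K^{\delta/2} \sum_{k=1}^{K} \sum_{m=1}^{M} p^m_{j;N}\, E[|g_k(\eta_m)|^{2+\delta}]
  \le C_2 < \infty,$$

where the constant $C_2$ does not depend on $j$ and $N$. This, together with (13), yields
(12). By the CLT we obtain $S_N \xrightarrow{d} N(0, \Sigma)$. □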

Proof of Theorem 3. Part (I). Since $F^+_{m;N}(x)$ is piecewise constant and $F_m(x)$ is
nondecreasing, the $\sup_x$ of $|F^+_{m;N}(x) - F_m(x)|$ can be achieved only at jump
points of $F^+_{m;N}(x)$. But $F^+_{m;N}(x) \ge \hat F_{m;N}(x)$ for all $x$, and if $x$
is a jump point, then

$$\hat F_{m;N}(x-) \le F^+_{m;N}(x-) \le F^+_{m;N}(x) \le \hat F_{m;N}(x).$$

Therefore,

$$\sup_x |F^+_{m;N}(x) - F_m(x)| \le \sup_x |\hat F_{m;N}(x) - F_m(x)|.$$

Similarly,

$$\sup_x |F^-_{m;N}(x) - F_m(x)| \le \sup_x |\hat F_{m;N}(x) - F_m(x)|.$$

By the Glivenko-Cantelli-type theorem for weighted empirical distributions (which
can be derived, e.g., as a corollary of Theorem 2.4.2 in [7]),

$$\sup_x |\hat F_{m;N}(x) - F_m(x)| \to 0 \quad \text{a.s. as } N \to \infty$$

if $\sup_{j,N} |a^m_{j;N}| < \infty$. The latter condition is fulfilled since
$\det \Gamma \ne 0$.

Thus,

$$\sup_{x \in \mathbb{R}} |F^*_{m;N}(x) - F_m(x)| \to 0 \quad \text{a.s. as } N \to \infty. \qquad (14)$$

For any $h : \mathbb{R} \to \mathbb{R}$ and any interval $A \subset \mathbb{R}$, let
$V_A(h)$ be the variation of $h$ on $A$. Take $A = (c_-, c_+)$. Then, under the
assumptions of the theorem,

$$|\hat h^{m*}_{;N} - \bar h^m|
  = \Big| \int_A h(x)\, d\big(F^*_{m;N}(x) - F_m(x)\big) \Big|
  \le \sup_{x \in A} |F^*_{m;N}(x) - F_m(x)| \cdot V_A(h) \to 0$$

a.s. as $N \to \infty$.


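Part (II). Note that if the assumptions of this part hold for some $A = (c_-, c_+)$,
then they will also hold for any new $c_-$, $c_+$ such that $A \subset (c_-, c_+)$. Thus,
we may assume that $F_m(c_-) < 1/4$ and $F_m(c_+) > 3/4$.

Consider the random event
$B^1_N = \{F^{\pm}_{m;N}(x) = F^+_{m;N}(x) \text{ for all } x < c_-\}$. Then
(14) implies $P\{B^1_N\} \to 1$ as $N \to \infty$.

We bound

$$|\hat h^{m\pm}_{;N} - \bar h^m| \le J_1 + J_2 + J_3, \qquad (15)$$

where

$$J_1 = \Big| \int_{-\infty}^{c_-} h(x)\, d\big(F^{\pm}_{m;N}(x) - F_m(x)\big) \Big|, \quad
  J_2 = \Big| \int_{c_-}^{c_+} h(x)\, d\big(F^{\pm}_{m;N}(x) - F_m(x)\big) \Big|, \quad
  J_3 = \Big| \int_{c_+}^{+\infty} h(x)\, d\big(F^{\pm}_{m;N}(x) - F_m(x)\big) \Big|.$$

Then $J_2 \xrightarrow{P} 0$ as in Part (I).

Let us assume that the event $B^1_N$ occurred and bound $J_1$.

If $h(x)$ is bounded as $x \to -\infty$, then we can take $c_- = -\infty$ and obtain
$J_1 = 0$. Consider the case of unbounded $h$. Since $h$ is monotone, we have
$h(x) \to +\infty$ or $h(x) \to -\infty$ as $x \to -\infty$. We will consider the first
case; the reasoning for the second one is analogous. Thus, $h(x) \to +\infty$ as
$x \to -\infty$, and we can take $h(x) > 0$ for $x < c_-$.

By the inequality (16) in [10],

$$P\Big[ \sup_{t < x} |F^+_{m;N}(t) - F_m(t)| > \varepsilon \Big]
  \le C_1 \big( \bar F^2(x)\, \varepsilon^{-4} N^{-2} + \bar F(x)\, \varepsilon^{-2} N^{-1} \big),
  \qquad (17)$$

where $\bar F(x) = \sum_{m=1}^{M} F_m(x)$, $C_1 < \infty$.

Let us take $x_0, x_1, \dots, x_j, \dots$ such that $h(x_j) = 2^j h(c_-)$. By assumption
(ii) and the Chebyshev inequality,

$$\bar F(x) = \sum_{m=1}^{M} P[\eta_m < x]
  \le \sum_{m=1}^{M} \frac{E[|h(\eta_m)|^{2+\gamma}]}{|h(x)|^{2+\gamma}},$$

and

$$\bar F(x_j) \le D\, 2^{-(2+\gamma) j}$$

for some $D < \infty$.

Let $\varepsilon_j = 2^{-(1+\gamma/4) j} N^{-1/4}$. Then by (17)

$$P\Big[ \sup_{t < x_j} |F^+_{m;N}(t) - F_m(t)| > \varepsilon_j \Big]
  \le C_2 \big( 2^{-j\gamma} N^{-1} + 2^{-j\gamma/2} N^{-1/2} \big)$$

for some $C_2 < \infty$. Denote
$B^2_N = \bigcap_j \{ \sup_{t < x_j} |F^+_{m;N}(t) - F_m(t)| \le \varepsilon_j \}$. Then

$$P[B^2_N] \ge 1 - \sum_{j=1}^{\infty} C_2 \big( 2^{-j\gamma} N^{-1} + 2^{-j\gamma/2} N^{-1/2} \big)
  \ge 1 - C_3 N^{-1} - C_4 N^{-1/2} \to 1$$

as $N \to \infty$. Now,
$J_1 = | \int_{-\infty}^{c_-} h(x)\, d(F^+_{m;N}(x) - F_m(x)) |$. If $B^1_N$ and $B^2_N$
occur, then

$$J_1 \le \sum_{j=0}^{\infty} \int_{x_{j+1}}^{x_j} |F^+_{m;N}(x) - F_m(x)|\, |h(dx)|
  \le \sum_{j=0}^{\infty} \varepsilon_j\, h(c_-) (2^{j+1} - 2^j) \le C_5 N^{-1/4}.$$

Thus, $P[J_1 \le C_5 N^{-1/4}] \ge P[B^1_N \cap B^2_N] \to 1$ and
$J_1 \xrightarrow{P} 0$.

Similarly, $J_3 \xrightarrow{P} 0$.

Combining these bounds with (15), we accomplish the proof. □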

Proof of Theorem 4. This theorem is a simple corollary of Theorem 2 and the
continuity theorem for weak convergence (Theorem 3B in Chapter 1 of [3]). □

Proof of Theorem 5. Since $\tilde g_N$ and $\tilde g^m_{k,l;N}$ are consistent,
$\hat\Sigma_N \xrightarrow{P} \Sigma$. Similarly, the continuity of $T'$ and consistency
of $\tilde g_N$ imply $T'(\tilde g_N) \xrightarrow{P} T'(\bar g)$. Then, with
$\det D \ne 0$ in mind, we obtain $\hat D_N^{-1} \xrightarrow{P} D^{-1}$.

Denote $\tilde s_N = N \hat T_N^T D^{-1} \hat T_N$. By Theorem 4 and the continuity
theorem, $\tilde s_N \xrightarrow{d} \chi^2_L$.

By Theorem 4, $\sqrt{N}\,\hat T_N$ is stochastically bounded. Thus,

$$|\tilde s_N - \hat s_N|
  = \big| \sqrt{N}\,\hat T_N^T (D^{-1} - \hat D_N^{-1}) (\sqrt{N}\,\hat T_N) \big|
  \xrightarrow{P} 0.$$

Therefore, $\hat s_N$ converges in distribution to the same limit as $\tilde s_N$, that
is, to $\chi^2_L$. □